1. Introduction.

From the data collected in a study (1) on the journal reading habits
of physicists and chemists, it can be estimated that if journals of a given
discipline (e.g. chemistry) are ranked according to decreasing frequency of
being read, the probability $p_r$ with which a journal of rank $r$ is read varies
approximately as a Yule distribution (2), i.e., $p_r \propto 1/r^{\alpha}$,
$\alpha$ being approximately one in this case. The salient characteristic of
the Yule distribution
is its long tail (slowly decreasing for increasing $r$). Thus, while 10 journals
account for 50% of the reading, the remaining amount of reading is spread over
a large number of different journals. While the few most frequently read
journals are probably read by everyone in the same general field of interest,
the remaining journals differ a great deal from person to person, depending
strongly on the special interest of the reader. In the face of rapid increases
in the number of published journals, mostly in areas of high specialization, it
is becoming increasingly difficult for an individual to discover items of
potential value without an enormously increased reading load. The situation is
even worse for items in unexpected or unknown sources. The discovery of these
"rare" items of interest is the problem to which the present system is addressed.


To a certain extent the problem of finding rare items is automatically
alleviated in practice by extensive information exchanges among scientists
with similar interests. Similarity in interests leads to "cliques" or "clusters"
within which channels for efficient information exchange exist. The primary
goal of the system being considered here is to discover, formalize, and utilize
these clusters for the purpose of increasing the likelihood that an individual
discovers a rare item of information.

The effectiveness of such a system can be estimated from the following
considerations. Assume that n scientists of the same interest-cluster consult
the infrequently read journals in a statistically independent manner. Let q
be the probability of finding an item of interest in these unusual sources upon
consulting them. Let p be the total probability of consulting such journals
for each person. In a cluster where such a discovery by one member would
mean automatic discovery by all the members, the probability of a member
discovering a rare item is $1 - (1 - qp)^n$, or about $1 - e^{-nqp}$ when $qp$
is small, as compared with $qp$ for an isolated individual.
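
As a rough numerical illustration of this estimate, the following sketch
evaluates the exact expression $1 - (1 - qp)^n$ alongside its exponential
approximation. The function names and the sample values of q, p and n are
assumptions chosen for illustration only; they are not figures from the study.

```python
import math

def cluster_discovery_probability(q, p, n):
    """Exact probability 1 - (1 - q*p)**n that at least one of n
    independently reading members finds the rare item."""
    return 1.0 - (1.0 - q * p) ** n

def cluster_discovery_approx(q, p, n):
    """Approximation 1 - exp(-n*q*p), reasonable when q*p is small."""
    return 1.0 - math.exp(-n * q * p)

# Hypothetical values, for illustration only.
q, p, n = 0.1, 0.2, 15
isolated = q * p
exact = cluster_discovery_probability(q, p, n)
approx = cluster_discovery_approx(q, p, n)
print(f"isolated individual: {isolated:.3f}")
print(f"cluster of {n} (exact): {exact:.3f}")
print(f"cluster of {n} (approx): {approx:.3f}")
```

With these assumed values the chance of discovery for a cluster member is
roughly an order of magnitude larger than for an isolated reader, which is the
effect the argument relies on.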

In practice, even better performance can be expected. The probability
of discovery on an individual basis, i.e., pq, varies from person to person.
There usually exist one or more members in any cluster who are much better
informed than the others. The system of exchange within a cluster has the effect
of raising every member to at least the same degree of informedness as that of
the best-informed member.

Compared to a system where information is disseminated from a
central source, this system of exchange through clustering has several
distinct attractive features. First, since the items distributed are "rare" items,
the amount of information of minor interest or no interest at all to a
participant is kept to a minimum. Secondly, since this is a system of exchange,
the task of administering such a program is kept simple.

The remainder of the paper describes the design of the system, the
formation of clusters, and an analysis of some preliminary performance data.

2. The Experimental System.

In the experimental system to be described here, people are grouped
according to the similarities of their reading interests. What is read is, of
course, only one of several possible relevant criteria for grouping interests;
others include what courses were best liked, what papers were written, the
responses to keywords, etc. Reading was chosen here because it was
particularly simple to obtain data for, using a variant of a procedure used by
King and Tanimoto (3).*


* They presented 20 respondents with the tables of contents of a few selected
journals, and asked them to check those they would read.

The library, acting as a central message exchange, was supplied
with a distribution list for each participant, listing all the other participants
with very similar reading interests. Any participant can inject items of
information into the system by submitting them to the library by telephone or in
writing. The item might be: a complete or partial description of a particularly
recommendable article which others are not likely to have come across;
a technical question on which help is solicited from someone who might be
uniquely qualified to give it; an idea for an experiment, a device, or a theoretical
study on which reactions or comments are desired; a new finding to be announced
to those interested; etc. If the originator submits it in writing, he records it
on a special card, which has no rigid format except that it classifies the nature
of the entry, and sends it to the library. If he telephones, the library prepares
this card.

On receipt of such an item, the library duplicates it as many times
as there are members on the sender's distribution list, and disseminates it to
them. In order to monitor the recipients' responses for experimental purposes,
a recipient receives, together with the duplicated entry, a simple response card,
similar to that used in the SDI system (4). On this, he indicates whether the
item of information he received was of interest and/or new, and he returns the
card for analysis.

This system shares with the SDI (Selective Dissemination of Information)
system the feature of trying to supply information about the scientific
literature which would otherwise not be readily available to participants.
Because the SDI system depends on a central source for scanning, selecting,
and abstracting the literature, the amount and quality of the service depend
only on this source, not on the number or level of the participants. In the
system described here, by contrast, the service does depend on the number and
level of the participants and, as pointed out above, the
quality of service can be made very high by narrowly restricting membership
in an interest cluster and increasing the number of participants.

In the future, the grouping of participants will have to be revised and
checked periodically to take into account shifting interests, additional
participants, and the quality and quantity of service. The data for this arise
from the responses which recipients of information feed back into the system
on a continuing basis. The "system" thus has two major functions, which may
eventually be automatic:

(1)  transmission,  duplication  and  routing  of  information; 

(2)  continually  sensing  the  state  of  the  system  and  using  this 
information  to  control  its  growth  and  operation. 

3.  Design  of  the  Experiment. 

To nucleate the system, it was decided to solicit participation from
only certain members of IBM Research who were willing to take considerable
initiative in contributing to its success. On the basis of a letter which was
circulated to over 100 staff members of IBM Research, explaining the
proposed system, thirty volunteered to participate, as a start. This
group should not be regarded as a sample from which to draw substantive
conclusions about the applicability to other groups, but this was not our
goal. The purpose of this study was to demonstrate that there exists a
system (the network of people and set of procedures) which would enable its
members to obtain greater access to the literature with relatively little
effort. To test the professional reading interests of these respondents,
the following crude method was used while an improved testing procedure
is under development.

A random sample of 200 articles, represented by title and author
only, was selected from the winter 1961 issues of about 450 English-language
technical journals available in the library of the IBM Research Center. Each
of the 30 respondents was asked to indicate, on a four-point ordered scale,
to what extent he would be interested in the article on the basis of title and
author. This test was administered through interviewing, along with a number of
open-response questions designed to further characterize the respondents'
professional interests, usage of the literature, and information needs.

In the analysis to be described, the responses were grouped into two
categories, distinguishing no interest at all from its opposite: that is, the
three response categories indicating degrees of positive interest were lumped
together. This was done only to keep the consequent computations within
reasonable bounds. The data were summarized in a table listing the
respondents as row headings, and the articles in the sample as column
headings. If a particular respondent expressed interest in a specific article,
a 1 was entered in the cell corresponding to the appropriate row and column;
for no interest, nothing was recorded, and it was treated as a 0 entry. Had
the four-point scale been used, each article would be allowed three columns,
representing the three categories of positive interest; a 0 or 1 would again
be entered into the appropriate cell, and the analysis would proceed exactly
as described, except that a 30 x 600 rather than a 30 x 200 table would have to
be dealt with.
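
The two tabulations just described can be sketched as follows. The numeric
coding of the four-point scale (0 for no interest, 1 to 3 for the degrees of
positive interest) and the function names are assumptions made for this
illustration; the paper does not specify a coding.

```python
def collapse_to_binary(scores):
    """Lump the three positive categories together: 0 stays 0, 1..3 become 1."""
    return [[1 if s > 0 else 0 for s in row] for row in scores]

def expand_to_indicators(scores):
    """Give each article three 0/1 columns, one per positive-interest category,
    so that a table with N article columns becomes one with 3N columns."""
    return [[1 if s == level else 0 for s in row for level in (1, 2, 3)]
            for row in scores]

# Hypothetical scores of 2 respondents on 3 articles (assumed 0..3 coding).
scores = [
    [0, 2, 3],
    [1, 0, 0],
]
binary = collapse_to_binary(scores)
expanded = expand_to_indicators(scores)
print(binary)
print(expanded)
```

Applied to 30 respondents and 200 articles, the first function would yield the
30 x 200 table used in the analysis and the second the 30 x 600 table mentioned
as the alternative.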


Inasmuch  as  the  procedures  for  testing,  sampling  and  validating 
statistical  inferences  are  still  under  development,  the  detailed  methodological 
considerations  will  be  deferred  to  a  later  paper. 


4. Clustering Analysis.

(a) Measure of Similarity:

Let $r_{ik}$ be the response of the $i$-th person to the $k$-th article,
such that

$$ r_{ik} = \begin{cases} 1, & \text{if interested,} \\ 0, & \text{if not,} \end{cases} \qquad i = 1, \ldots, n; \quad k = 1, \ldots, N. \tag{1} $$

Here, n is the number of people (30 at the time of this report) and N the
number of articles used to test them (200 in the first trial). Let R be the
matrix with elements $r_{ik}$. Define the matrix C by

$$ C = R R^{T}, \tag{2} $$

where $R^{T}$ denotes the transpose of R. A typical element $c_{ij}$ of C
represents the number of articles in which the interests of i and j co-occur.
In particular, the diagonal element $c_{ii}$ is the number of articles in which
person i expressed interest.
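
A minimal sketch of this computation is given below, using plain Python lists
rather than a matrix library. The function name and the small response matrix
are hypothetical and serve only to show what the entries of C count.

```python
def cooccurrence(R):
    """Return C = R R^T for a 0/1 response matrix R given as a list of rows
    (one row per person, one column per article)."""
    n = len(R)
    return [[sum(a * b for a, b in zip(R[i], R[j])) for j in range(n)]
            for i in range(n)]

# Hypothetical responses of 3 people to 5 articles (1 = interested).
R = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0],
]
C = cooccurrence(R)
# C[i][i] counts the articles person i is interested in;
# C[i][j] counts the articles that interest both i and j.
print(C)
```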


The similarity, or association factor, $s_{ij}$ between two people i and
j is defined as

$$ s_{ij} = \frac{N c_{ij}}{c_{ii} c_{jj}}. \tag{3} $$


A number of other measures of association have been used, and five of these
are listed in Table 1 for comparison.

The definition of Stiles (5) is a form of the chi-square formula on
a 2x2 contingency table and includes Yates' correction (6).

King and Tanimoto (3) used two different measures. The first of
these, $s_{ij}$, is called the similarity measure; the second, $d_{ij}$, the
distance measure, which is simply the negative log of $s_{ij}$. Our own
definition was derived from the following considerations.


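Author               Association Factor

Stiles (5)           $\dfrac{N \left( |c_{ij} N - c_{ii} c_{jj}| - N/2 \right)^2}{c_{ii} c_{jj} (N - c_{ii})(N - c_{jj})}$

Baxendale (7)        [formula illegible]

King-Tanimoto (3)    $s_{ij} = \dfrac{c_{ij}}{c_{ii} + c_{jj} - c_{ij}}$,  $d_{ij} = -\log s_{ij}$

Luhn-Savage* (4)     [formula illegible]

Kochen-Wong          $\dfrac{N c_{ij}}{c_{ii} c_{jj}}$

Table 1 - A Comparison of Association Factors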

* Unlike the other measures, the Luhn-Savage definition measures association
between different entities (documents and people), and is not symmetric,
i.e., $s_{ij} \neq s_{ji}$.

Assume that person i responds favorably to an article with probability
$p_i$, and that responses to successive articles are independent (the
independence of successive responses is a problem involved in sampling). If the
frequency ratios $c_{ii}/N$ are taken to be the estimates of $p_i$, the mean
number of coincidences between i and j is given by:

$$ N p_i p_j = \frac{c_{ii} c_{jj}}{N}. \tag{4} $$


Our measure of association is now defined as the ratio of the actual
coincidence to the coincidence expected on the basis of independence:

$$ s_{ij} = \frac{c_{ij}}{c_{ii} c_{jj} / N} = \frac{N c_{ij}}{c_{ii} c_{jj}}. \tag{5} $$

In addition to computational ease, this definition has the advantage of
possessing a simple intuitive interpretation. For example, for $s_{ij} = 5$ one
can say that the actual coincidence between i and j is five times what is
expected on the basis of zero association (independence). Values of $s_{ij}$
greater than 1 indicate positive association, $s_{ij} = 1$ zero association,
and $s_{ij} < 1$ negative association, or dissociation. (The quantity $\log s_{ij}$
reflects these properties directly, but distorts the scale in an undesirable
manner.) When the cluster-finding procedure is fully programmed, it
would be desirable to use a statistically more satisfactory definition, like

$$ s_{ij} = \frac{N \left[ c_{ij} N - c_{ii} c_{jj} \right]^2}{c_{ii} c_{jj} (N - c_{ii})(N - c_{jj})}. \tag{6} $$

This is closely related to that of Stiles, and is based on the 2x2 contingency
table shown in Table 2.
                                      Person i
                          interested          not interested                     total

Person j  interested      $c_{ij}$            $c_{jj} - c_{ij}$                  $c_{jj}$
          not interested  $c_{ii} - c_{ij}$   $N + c_{ij} - c_{ii} - c_{jj}$     $N - c_{jj}$

          total           $c_{ii}$            $N - c_{ii}$                       $N$

Table 2



A particular advantage of this definition is that it can be extended easily to
deal with multiple-point scale responses, e.g. (not interested, interested,
very interested, and vital). In such cases one merely has to expand the
contingency table and use a general version of this formula as a measure of
association (8).
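
The association factor defined above as the ratio of actual to expected
coincidences, and the four cells of the 2x2 contingency table for a pair of
people, can both be computed directly from the co-occurrence counts. The
following is a sketch only; the function names, the use of None for a person
with no positive responses, and the sample counts are assumptions introduced
for illustration.

```python
def association_matrix(C, N):
    """s_ij = N * c_ij / (c_ii * c_jj); None where a person liked no article
    (an assumed convention, since the ratio is undefined there)."""
    n = len(C)
    S = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if C[i][i] > 0 and C[j][j] > 0:
                S[i][j] = N * C[i][j] / (C[i][i] * C[j][j])
    return S

def contingency_table(C, N, i, j):
    """2x2 table for persons i and j: rows are j (interested, not interested),
    columns are i (interested, not interested)."""
    both = C[i][j]
    j_only = C[j][j] - C[i][j]
    i_only = C[i][i] - C[i][j]
    neither = N + C[i][j] - C[i][i] - C[j][j]
    return [[both, j_only], [i_only, neither]]

# Hypothetical co-occurrence counts for 3 people over N = 5 articles.
C = [[3, 2, 0], [2, 3, 0], [0, 0, 1]]
N = 5
S = association_matrix(C, N)
table = contingency_table(C, N, 0, 1)
print(S[0][1])   # 5 * 2 / (3 * 3)
print(table)
```

The four cells always sum to N, which is a convenient check on the counts.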

(b) Definition of Cluster:

A set C of k people is said to form a cluster relative to
threshold $\epsilon$ if, for every i, j in C,

$$ s_{ij} \geq \epsilon. $$

The quantity $\epsilon$ is a parameter to be chosen a priori, and determines
the "strength" of the clusters formed. In general, an increase in $\epsilon$ will
cause smaller and more "closely-knit" clusters to be formed.
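
The definition amounts to a pairwise test, which can be sketched as below. The
function name and the small association matrix are hypothetical; people are
indexed from 0 and the diagonal entries are never consulted.

```python
from itertools import combinations

def is_cluster(members, S, eps):
    """True if every pair i, j of members satisfies S[i][j] >= eps."""
    return all(S[i][j] >= eps for i, j in combinations(members, 2))

# Hypothetical symmetric association matrix for 4 people (diagonal unused).
S = [
    [0.0, 5.0, 4.0, 1.0],
    [5.0, 0.0, 3.5, 2.0],
    [4.0, 3.5, 0.0, 1.0],
    [1.0, 2.0, 1.0, 0.0],
]
print(is_cluster([0, 1, 2], S, eps=3))   # all three pairs reach 3
print(is_cluster([0, 1, 2], S, eps=4))   # pair (1, 2) has only 3.5
print(is_cluster([0, 1], S, eps=4))      # the smaller set still qualifies
```

In this made-up example, raising the threshold from 3 to 4 breaks the
three-person set while a two-person subset survives, in line with the remark
that a larger threshold gives smaller, more closely-knit clusters.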

The choice of a suitable definition of "cluster" poses a difficult
problem which has occurred in a wide variety of applications (9, 10, 11, 12).
Several definitions of "cluster" have been proposed (12, 13, 14, 15). The final
choice adopted here was conservative in that it requires that in every "cluster"
the association of reading interest between any two people must equal or exceed
a certain minimum level. This choice of a "narrow cluster" definition was
designed to ensure that the clusters are homogeneous and closely-knit groups,
at the risk of leaving out people who may properly belong to clusters. The
point to be emphasized here is that, under this concept, two individuals with a
high degree of association may belong to different clusters by virtue of their
associations with other people.

Thus the problem of finding clusters can be stated as follows:

Given a collection of people, find sets $C_1, C_2, \ldots, C_m$ (not
necessarily disjoint), such that $s_{ij} \geq \epsilon$ for i and j belonging
to the same $C_k$.


The clusters thus defined may not be unique. Furthermore, there
exist no known schemes, other than exhaustive ones, for finding the clusters.
Therefore, it is of considerable practical importance that an algorithm be
developed for forming the clusters. An algorithm to accomplish this, based, in
part, on the kind of heuristic devices used by people in extracting clusters
from the data, has been developed and will be described in the next
section.
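
To make the cost of an exhaustive scheme concrete, here is a brute-force sketch
that tests every subset of people and keeps the maximal ones satisfying the
cluster definition. It is not the algorithm of this paper; the function name,
the minimum size of 2, and the sample matrix are assumptions, and the running
time grows as 2^n, which is why it is only usable for very small groups.

```python
from itertools import combinations

def maximal_clusters(S, eps, min_size=2):
    """Enumerate all maximal sets whose members pairwise satisfy S[i][j] >= eps."""
    n = len(S)
    found = []
    # Examine subsets from largest to smallest so that any subset of an
    # already accepted cluster can be skipped as non-maximal.
    for size in range(n, min_size - 1, -1):
        for members in combinations(range(n), size):
            if any(set(members) <= c for c in found):
                continue
            if all(S[i][j] >= eps for i, j in combinations(members, 2)):
                found.append(set(members))
    return [sorted(c) for c in found]

# Hypothetical association matrix: persons 0-2 and 2-4 form two groups
# that overlap in person 2 (diagonal unused).
S = [
    [0, 4, 4, 1, 1],
    [4, 0, 4, 1, 1],
    [4, 4, 0, 4, 4],
    [1, 1, 4, 0, 4],
    [1, 1, 4, 4, 0],
]
clusters = maximal_clusters(S, eps=3)
print(clusters)
```

The example also shows that the sets found need not be disjoint: person 2
belongs to both clusters.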

(c) Algorithm for Cluster-Formation:

The algorithm can best be described by an illustrative example.
The association matrix for this example is shown in Figure 1, where the diagonal
terms are omitted.
In the procedure outlined below, a second parameter, $\epsilon'$, with
$\epsilon' > \epsilon$, is used. For this example, $\epsilon = 3$ and
$\epsilon' = 4$.

Step 1.
Procedure: For each i, starting with i = 1, find all the j's for which
$s_{ij} \geq \epsilon'$.
Example: For i = 1, j = 2, 4, 5, 8.

Step 2.
Procedure: Form the set $\sigma_i$ containing as elements i and the j's found
in step 1. Discard any set which is entirely contained in a previous set or
contains fewer than 4 elements.
Example: $\sigma_1 = (1, 2, 4, 5, 8)$.

Application of Steps 1 and 2 to the numerical data of Figure 1 results in the
following $\sigma_i$:

$\sigma_1 = (1, 2, 4, 5, 8)$,
$\sigma_2 = (1, 2, 3, 4, 5, 6, 7)$.

Step 3.
Procedure: Order the elements in each of the $\sigma_i$ so that for every pair,
j and k, in $\sigma_i$ for which $s_{jk} < \epsilon$, j and k are on different
sides of every $l$ for which $s_{jl} \geq \epsilon$ and $s_{kl} \geq \epsilon$.
Example: $\sigma_2 = (1, 2, 3, 4, 5, 6, 7)$. After ordering,
$\sigma_2 = (1, 4)(2, 5)(3, 6, 7)$,
where elements within the same parenthesis may be reordered at will.
Note that if j = 1, k = 3, then $s_{13} = 2.4$, so that $s_{13} < 3$; the only
$l$ for which both $s_{1l}$ and $s_{3l}$ exceed 3 are $l = 2$ and $l = 5$;
thus, both 2 and 5 must be placed between 1 and 3.

Step 4.
Procedure: Combine the partly ordered sets, and apply the requirement of
step 3.
Example: Combining $\sigma_1$ and $\sigma_2$ results in
(8)(1, 4)(2, 5)(3, 6, 7).
After applying the requirement of step 3, this becomes
(1, 4)(8)(2, 5)(7)(3, 6).

The association matrix with rows and columns properly reordered is shown in
Figure 2.
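
The first two steps of the procedure lend themselves to a short sketch. The
version below covers only Steps 1 and 2 (forming and pruning the initial sets);
the ordering and merging of Steps 3 and 4 are not attempted. The function name
and the 6-person matrix are hypothetical, people are indexed from 0, and the
data are not those of Figure 1.

```python
def seed_sets(S, eps_prime, min_size=4):
    """Steps 1 and 2: for each i collect i and all j with S[i][j] >= eps_prime,
    discarding sets that are too small or contained in an earlier kept set."""
    n = len(S)
    kept = []
    for i in range(n):
        sigma = {i} | {j for j in range(n) if j != i and S[i][j] >= eps_prime}
        if len(sigma) < min_size:
            continue                      # fewer than min_size elements
        if any(sigma <= earlier for earlier in kept):
            continue                      # entirely contained in a previous set
        kept.append(sigma)
    return [sorted(s) for s in kept]

# Hypothetical symmetric association matrix for 6 people (diagonal unused).
S = [
    [0, 5, 4, 4, 1, 1],
    [5, 0, 4, 5, 4, 4],
    [4, 4, 0, 2, 1, 1],
    [4, 5, 2, 0, 1, 2],
    [1, 4, 1, 1, 0, 5],
    [1, 4, 1, 2, 5, 0],
]
seeds = seed_sets(S, eps_prime=4)
print(seeds)
```

Note that, following the wording of Step 2, a set is discarded only if it is
contained in a set formed earlier, so an early set may survive even though a
later, larger set contains it.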

The clusters and their interrelationships are apparent at a glance from
Figure 2. If one adheres to the definition strictly, there are three clusters
for this example. However, for operational purposes the two important
clusters are (1, 4, 8, 2, 5) and (2, 5, 7, 3, 6). It should be emphasized that
the procedure outlined earlier does not merely find clusters. In fact, for
pure enumeration of clusters there may well be more efficient procedures.
In the process of finding the clusters, it has been possible to display the
relationship among clusters in a succinct manner.

The matrix-permutation procedure was based on several assumptions
concerning the nature of the population (16):

(1)  Clustering  to  a  large  extent  exists  among  the  members  of  the 
population. 

(2) The clusters are either isolated or overlap in a simple way. This
assumption is equivalent to the hypothesis that the rows and columns of matrix
S can be permuted to have a structure as shown in Figure 3, where every
entry in the shaded area (principal submatrices) is greater than or equal to the
threshold $\epsilon$, and every entry in the unshaded area is less than
$\epsilon$. This assumption implies that overlaps such as those shown in Figure
4 do not occur.

This assumption is only approximately valid in practice. That is, the
resultant matrix will have a structure like the one shown in Figure 3, but will
have some entries greater than $\epsilon$ in the unshaded areas.

There are two parameters in the procedure. One of these, $\epsilon$,
defines and controls the clusters that are obtained. The other parameter,
$\epsilon'$, is used to initiate the procedure, and should not affect the final
clusters that are found. A number of ways of choosing these parameters
in a given problem are being investigated. One idea is to let $\epsilon$ and
$\epsilon'$ be constant multiples of the average association, i.e., each
threshold is a fixed constant times the mean of the $s_{ij}$. The two constants
are to be experimentally determined once and for
all, and would not vary from problem to problem.
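
This rule for choosing the thresholds can be sketched as follows. The
multipliers k and k_prime, the exclusion of the diagonal from the average, and
the sample matrix are all assumptions made for illustration; the paper leaves
the constants to be determined experimentally.

```python
def average_association(S):
    """Mean of the off-diagonal entries of a square association matrix
    (excluding the diagonal is an assumption of this sketch)."""
    n = len(S)
    values = [S[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(values) / len(values)

def thresholds(S, k, k_prime):
    """Return (eps, eps_prime) as constant multiples of the average association."""
    s_bar = average_association(S)
    return k * s_bar, k_prime * s_bar

# Hypothetical association matrix and multipliers.
S = [
    [0.0, 2.0, 4.0],
    [2.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]
eps, eps_prime = thresholds(S, k=1.0, k_prime=1.5)
print(eps, eps_prime)
```

Choosing k_prime larger than k keeps the initiating threshold above the
clustering threshold, as the procedure requires.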


5. Conclusion.

Of the 30 people tested, substantial clustering was found to exist
for 15 people. These 15 people formed two clusters. It is of interest to
note that the interests of the members of both clusters are primarily in the
physical sciences (physics, chemistry, metallurgy, etc.). The failure of
the other participants to cluster is probably due to the insufficient number
of people in each specialty.

During the initial 4 weeks, the system was in a testing phase. In
order to obtain a substantial amount of data in a short period of time, normal
operating procedures were deviated from in two important respects. First,
items were distributed to members of both clusters regardless of the source
of the item. During normal operation the distribution will be confined to the
cluster from which the item is initiated. Secondly, no attempt was made to
confine the exchange to only rare items. At the end of the testing period, the
participants were informed that only items of unusual interest and from
unusual sources should be reported, in conformance with the primary aim of
the system.

During the four-week testing period, a total of 41 items were initiated
and distributed. Only one item failed to evoke any favorable response, i.e.,
a response of "interested". Of the 41 items, 35 were initiated by members of
cluster #1 and 6 by members of cluster #2. The acceptance rates are shown in
Table 3.


              Average Percent of Acceptance per Person

Cluster    Items initiated from      Items initiated from
           within the cluster        without the cluster

#1              18.5                       4.1
#2              46.5                       5.1

Table 3.

The figures of 18.5% and 46.5% represent improvements in acceptance rate
of approximately five-fold and nine-fold respectively. The improvement
ratio can be compared with the average association factor for the two clusters,
the association factor having precisely the interpretation of expected
improvement ratio. The comparison is shown in Table 4.


Cluster    Average Association Factor    Improvement ratio

1               4.66                          4.5
2               4.95                          9.1

Table 4


The  agreement  for  cluster  #1  is  obviously  good.  The  deviation  from  agreement 
for  cluster  #2  is  probably  due  to  the  small  sample  (6  items  initiated  from  cluster 
#2). 
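
The improvement ratios in Table 4 follow from Table 3 by dividing the
within-cluster acceptance rate by the rate for items from outside the cluster.
The short sketch below repeats that arithmetic with the tabulated figures; the
variable names are illustrative only.

```python
# Acceptance percentages from Table 3: (within the cluster, without the cluster).
acceptance = {1: (18.5, 4.1), 2: (46.5, 5.1)}
# Average association factors from Table 4.
average_association = {1: 4.66, 2: 4.95}

ratios = {}
for cluster, (within, without) in acceptance.items():
    ratios[cluster] = within / without
    print(f"cluster #{cluster}: improvement ratio {ratios[cluster]:.1f}, "
          f"average association factor {average_association[cluster]}")
```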


Although the limited data obtained thus far do not admit general
conclusions, the aim of effective discovery and dissemination of new items of
information through exchange among members of the clusters appears to be
substantiated. The system is being closely monitored for improvements of
both its operations and the basic mathematical model. Work is also being
undertaken to expand the system to include more participants, which should
lead to a larger number of more specialized clusters. Each cluster would
have more members of more closely similar interests, and this should
increase the probability that any member is referred to information of value
to him. Hopefully, being assured to some extent that what he should know
will be pointed out to him, he need not feel the necessity to read as much of
the available literature as keenly as he does now.